[57] ABSTRACT

An information storage system selects target segments for 
garbage collection only if their age in the information 
storage system exceeds an age threshold value and, once 
past the age threshold, in the order of least utilized segments 
first. The system determines the age of a segment by 
determining the amount of time a segment has been located 
in direct access storage devices (DASD) of the information 
storage system and considers a segment for garbage collec- 
tion only after the segment has been located in DASD for the 
selected age threshold value. From the set of candidate 
segments, the system chooses one or more for garbage 
collection in the order in which they will yield the most free 
space. The free space yield is determined by utilization data, 
so that the least utilized segments are garbage-collected first. 

43 Claims, 5 Drawing Sheets 
GARBAGE COLLECTION IN LOG-
STRUCTURED INFORMATION STORAGE 
SYSTEMS USING AGE THRESHOLD 
SELECTION OF SEGMENTS 

BACKGROUND OF THE INVENTION

1. Field of the Invention 

This invention relates generally to log-structured infor- 
mation storage systems of direct access storage devices 
(DASD) and, more particularly, to garbage collection of
segments in log-structured storage systems. 

2. Description of the Related Art 

To store increasing amounts of information, many computers
use external information storage systems. These systems
can provide improved write performance and data
redundancy over conventional disk storage configurations. 
The external storage systems typically have a dedicated 
controller that manages read and write operations to the 
storage system. Such systems can more efficiently store
large blocks of information and can provide redundant 
information storage in a manner that is transparent to the 
computer. 

Some external storage systems maintain information as 
log-structured files as described in "The Design and
Implementation of a Log-Structured File System" by M. Rosenblum
and J.K. Ousterhout, ACM Transactions on Computer
Systems, Vol. 10 No. 1, February 1992, pages 26-52. In a 
log-structured file system, information is stored in a direct 
access storage device (DASD) according to a "log" format,
as if being written to an infinite or near-infinite tape. A 
DASD may comprise, for example, a magnetic disk. 
Typically, new information is stored at the end of the log 
rather than updated in place, to reduce disk seek activity. As 
information is updated, portions of data records at
intermediate locations of the log become outdated.

One type of log-structured storage system is called a log 
structured array (LSA), obtained by combining the
log-structured file system architecture with a disk array
architecture such as the well-known RAID architecture described
in "A Case for Redundant Arrays of Inexpensive Disks 
(RAID)", Report No. UCB/CSD 87/391, December 1987, 
Computer Sciences Division, University of California, 
Berkeley, Calif. In an LSA system, an LSA control unit 
manages information storage to write updated data into new
disk locations rather than writing new data in place. Large 
amounts of updated data are collected in LSA control unit 
memory and are written to disk storage at the same time. As 
updated information (called "live" data) is stored to disk, the 
disk locations of the old data are no longer valid. The old
data is referred to as "garbage" or "dead" data. Units of disk 
storage, called segments, thereby become partially empty. 
To ensure a constant supply of disk space for storage of 
updated information, the LSA controller periodically
performs a garbage collection process in which partially empty
segments are compacted into a fewer number of completely 
filled segments, thereby creating a number of completely 
empty segments that are ready for updated information. 

Reading and writing into an LSA occurs under
management of the LSA control unit, also called a controller. An
LSA control unit can include resident microcode that emu- 
lates logical devices such as DASD disk drives, or tape 
drives. In this way, the physical nature of the external 
storage subsystem can be transparent to the operating system 
and to the applications executing on the computer processor
accessing the LSA. Thus, read and write commands sent by 
the computer processor to the external information storage 
system would be interpreted by the LSA controller and
mapped to the appropriate DASD storage locations in a 
manner not known to the computer processor. This com- 
prises a mapping of the LSA logical devices onto the actual 
DASDs of the LSA. 

In an LSA, data is stored among the multiple DASDs of 
the LSA and the memory in which updated data is tempo- 
rarily collected, or buffered, is called the input write buffer. 
The input write buffer typically contains one segment's 
worth of data and also is referred to as the memory segment. 
When the LSA input write buffer is filled, the new data 
stored in the buffer is recorded sequentially back into the 
DASDs of the LSA. Such an arrangement eliminates most 
DASD seek operations during data recording. 

As an illustration, consider an LSA consisting of a group 
of disk drive DASDs, each of which includes multiple disk 
platters stacked into a column. The recording area of each 
DASD in a group is divided into multiple areas having a 
logical (virtual) designation called a segment-column. For 
example, a segment-column of a DASD in an LSA is an area 
comprising all of the same-position tracks on all platters of 
the DASD. A segment is the collection of all segment- 
columns from all the DASDs in the LSA. Thus, a disk drive 
DASD unit in an LSA typically includes as many segment- 
columns as there are tracks on a disk platter. For example, 
if an LSA includes five DASD units, then the first track on 
each of the DASD platters in the first DASD is a segment- 
column, the first track on each of the DASD platters in the 
second DASD is another segment-column, and so forth. The 
first segment-column from each of the five DASDs in the 
LSA would form one logical segment. Therefore, an LSA 
typically has as many segments as there are segment- 
columns in a single disk drive unit. 

Many conventional multiple-platter disk drive systems 
number tracks sequentially from platter to platter of a disk 
drive unit. That is, conventionally the innermost track on the 
first platter is track 1, the innermost track on the second 
platter is track 2, and so forth such that the innermost track 
on the last (fifth) platter is track 5. Thus, the second track on 
the first platter of a five-platter disk drive unit would be track 
6, the second track on the second platter would be track 7, 
the third track on the first platter would be track 11, the third 
track on the second platter would be track 12, and so forth. 
Thus, the first LSA segment would comprise the collection 
of the innermost track on each platter (the first segment- 
column) from the first disk drive unit, the first segment- 
column from the second drive, and so forth through the fifth 
drive, the second LSA segment would comprise the collec- 
tion of the second segment-column (second track) from all 
of the disk drives, and so forth. Except for the track 
numbering convention, the recording area relationship 
between segments and segment-columns would be as 
described above. 
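
The conventional track numbering just described can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the five-platter default simply mirrors the example in the text.

```python
def conventional_track_number(track_pos: int, platter: int, platters: int = 5) -> int:
    """Return the 1-based conventional track number.

    track_pos: radial position of the track on its platter (1 = innermost).
    platter:   platter index within the drive (1-based).
    platters:  number of platters per drive (five in the text's example).
    """
    return (track_pos - 1) * platters + platter


def segment_column_tracks(segment: int, platters: int = 5) -> list:
    """Tracks of one drive that make up the segment-column of a given segment.

    Under the illustrative definition in the text, segment-column k of a
    drive is the k-th track position on every platter of that drive.
    """
    return [conventional_track_number(segment, p, platters)
            for p in range(1, platters + 1)]


if __name__ == "__main__":
    print(conventional_track_number(2, 1))  # second track, first platter
    print(segment_column_tracks(1))         # first segment-column of a drive
```

Under this numbering, the first segment-column of a five-platter drive covers tracks 1 through 5 and the second covers tracks 6 through 10.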

One segment-column per segment in an LSA is typically 
used to store parity information that is produced from a 
logical exclusive-OR operation on data stored in the
remaining data segment-columns of the segment. For improved
performance, the segment-columns containing the parity 
information are not all stored on the same disk drive unit, but 
are rotated among the disk drive units. This ensures accurate 
data rebuild in the event of a disk failure. 
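
As an illustration of the parity scheme described above, the following sketch computes a parity segment-column by exclusive-OR and rebuilds a lost data column from the survivors. The function names are hypothetical, and the modulo rotation rule in `parity_drive` is an assumption made for illustration; the text says only that parity columns are rotated among the drives.

```python
from functools import reduce


def parity_column(columns):
    """XOR equal-length byte strings position by position."""
    return bytes(reduce(lambda a, b: a ^ b, group) for group in zip(*columns))


def rebuild_column(surviving_columns):
    """Rebuild one lost column from all surviving columns (data plus parity).

    XOR is its own inverse, so the same operation recovers the missing column.
    """
    return parity_column(surviving_columns)


def parity_drive(segment: int, num_drives: int) -> int:
    """ASSUMED rotation rule: parity for segment s lives on drive s mod num_drives."""
    return segment % num_drives


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = parity_column(data)
    # Pretend the drive holding data[1] failed and rebuild its column.
    recovered = rebuild_column([data[0], data[2], parity])
    print(recovered == data[1])
```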

Whether an LSA stores information according to a vari- 
able length format such as a count-key-data (CKD) archi- 
tecture or according to a fixed block architecture, the LSA 
storage format of segment-columns is mapped onto the 
physical storage space in the disk drive units so that a logical 
track of the LSA is stored entirely within a single
segment-column mapped onto a disk drive unit of the array. The
size of a logical track is such that many logical tracks can be
stored in the same LSA segment-column. It should be understood
that the description above of a segment-column containing all
same-position tracks of all platters of a DASD is for
illustration, and other definitions of segment-column are
possible.

Because the input write buffer of an LSA such as described
above typically has a storage capacity of approximately one
logical segment, the data in the write buffer and the parity
segment-column computed from it together comprise approximately
one segment's worth of information. When the input write buffer
becomes substantially full, the LSA controller computes the
parity segment-column for the data in the write buffer and
records the data and parity information into the next available
empty segment mapped onto the array. That is, the first
segment-column of the input write buffer is written into the
first segment-column of the next available segment, the second
segment-column of the input write buffer is written into the
second segment-column of the same next segment, the third
segment-column of the input write buffer is written into the
third segment-column of the same next segment, and the process
is repeated to the last segment-column.

A block that contains data values for which there have been
later write operations, meaning that the data values have been
superseded, is available for recording new data. As noted
above, such superseded data is referred to as garbage (or
"dead") and the corresponding disk area is referred to as a
garbage block. A block containing data values that have not
been superseded contains valid data and is referred to as a
clean block or a live block. After a number of data modifying
write operations have been carried out in disk drive units
forming a log structured array, there likely will be at least
one segment's worth of garbage blocks scattered throughout the
array. By consolidating live blocks with valid data, a fully
empty segment can be created, which will then be available for
receiving new (live) data values from the input write buffer.

Creating empty segments is important because, for a controller
of an LSA to continue write operations as new data values are
received from the input write buffer, new empty segments in the
disk drive units must be produced continually. New empty
segments are typically produced by identifying live blocks
within segments containing live data and moving the live data
from these segments to consolidate them in a smaller number of
full segments. Such consolidation creates one or more segments
that contain only garbage blocks. A segment that is entirely
garbage is therefore empty and is available for recording one
segment's worth of data from the write buffer, as described
above. As noted above, the process of consolidating
noncontiguous live blocks so as to consolidate live data and
create empty segments is called garbage collection.

Garbage collection is usually done by first locating a target
segment having the fewest number of live data blocks (and
therefore the largest number of garbage blocks) in a disk drive
unit of the LSA. The live data values of the target segment are
read into a temporary storage buffer. The target segment
therefore becomes completely empty. Next, another target
segment is identified and the live data from that target
segment is read into the temporary storage buffer. This process
of locating target segments and reading their live data blocks
into the temporary storage buffer is repeated segment by
segment until the temporary storage buffer is full. Typically,
several target segments must be processed before the temporary
storage buffer will be full. After the temporary storage buffer
becomes full, the data from the buffer is recorded back into an
empty segment in the disk storage array.

As garbage collection proceeds, live data from the various
target segments is read into the temporary storage buffer, the
buffer fills up, and the live data is stored back into an empty
segment of the DASD array. After the live data in the temporary
storage buffer is written back into the DASD array, the
segments from which the live data values were read are
designated as being empty. In this way, live data is
consolidated into a fewer number of completely full segments
and new empty segments are created. Typically, garbage
collection is performed when the number of empty segments in
the array drops below a predetermined threshold value.

The way in which target segments are selected for the garbage
collection process affects the efficiency of LSA operation. The
LSA controller must determine how to collect segments when
performing the garbage collection. Two algorithms are used
conventionally, one called the greedy algorithm and one called
the cost-benefit algorithm. The greedy algorithm selects target
segments by determining how much free space will be achieved
for each segment processed and then processing segments in the
order that will yield the most amount of free space. The
cost-benefit algorithm compares a cost associated with
processing each segment against a benefit and selects segments
for processing based on the best comparisons.

More particularly, the greedy algorithm selects segments with
the smallest utilization first and moves the live tracks from
partially-filled segments to a target segment in a pool of
empty segments. A problem with the greedy algorithm is that the
process might take a segment too soon. By waiting longer for a
partially-filled segment to get older, the segment might then
be even more empty. If the segment is more empty, fewer live
data tracks will need to be moved, making the garbage
collection process more efficient.

In the cost-benefit algorithm, a target segment is selected
based on how much free space is available in the segment and
how much time has elapsed since the segment was last filled
with new information. The elapsed time is referred to as the
age of the segment. In the cost-benefit algorithm the age of a
segment is defined to be the age of the youngest live track in
the segment. For example, age might be indicated by a time
stamp value associated with a track when it is placed in the
LSA input write buffer. A benefit-to-cost ratio is calculated
for each segment, such that the ratio is defined to be:

$$\frac{\text{benefit}}{\text{cost}} = \frac{(1-u)\,a}{1+u}$$

where u is called the utilization of the segment; (1-u) is
defined to be the percentage amount of free space in the
segment, also called the "dead" fraction; and a is the age of
the segment as defined above. The cost-benefit algorithm orders
segments by their benefit-to-cost ratio and selects as target
segments those with the largest ratios. The numerator in the
ratio represents the benefit to selecting the segment, being
the product of the dead fraction (1-u) and the age a. The
denominator (1+u) represents the cost of selecting the segment
for garbage collection, because the whole segment (all tracks)
is read into the buffer and a fractional part u of the segment
(the live tracks) is written back to DASD.

A problem with the cost-benefit algorithm is the overhead
associated with computing the benefit-to-cost ratios for each
segment in the LSA and maintaining an ordering of the
segments according to their benefit-to-cost ratios. The over- 
head quickly becomes prohibitive as the system is scaled 
upward in size. In particular, two segments can switch 
cost-benefit ratios, thereby switching their ordering for
garbage collection, simply with the passage of time and without
regard to any change in actual utilization rate. In this way, 
a segment may have to be re-ordered even though its
utilization hasn't changed. Note that the benefit (numerator
above) is a function of age. Thus, a segment may be selected
even though efficiency considerations might suggest that 
other segments with smaller utilization rates should be 
selected for garbage collection first. 
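
The re-ordering problem can be made concrete with a small sketch that ranks two segments under the greedy rule (smallest utilization first) and under the benefit-to-cost ratio (1-u)a/(1+u). The class and function names are hypothetical, and the single `written_at` time stamp is an assumed stand-in for the time stamp of the youngest live track in the segment.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    utilization: float  # u, fraction of the segment holding live tracks
    written_at: float   # assumed time stamp of the youngest live track


def benefit_to_cost(seg: Segment, now: float) -> float:
    """Benefit-to-cost ratio (1 - u) * a / (1 + u), with age a = now - written_at."""
    age = now - seg.written_at
    return (1.0 - seg.utilization) * age / (1.0 + seg.utilization)


def greedy_order(segments):
    """Greedy rule: least utilized segments first; independent of time."""
    return sorted(segments, key=lambda s: s.utilization)


def cost_benefit_order(segments, now: float):
    """Cost-benefit rule: largest ratio first; must be recomputed as time passes."""
    return sorted(segments, key=lambda s: benefit_to_cost(s, now), reverse=True)


if __name__ == "__main__":
    segs = [Segment("A", utilization=0.8, written_at=0.0),
            Segment("B", utilization=0.5, written_at=90.0)]
    for now in (100.0, 200.0):
        print(now, [s.name for s in cost_benefit_order(segs, now)])
    print("greedy", [s.name for s in greedy_order(segs)])
```

With these illustrative numbers, neither utilization changes, yet the cost-benefit ordering of A and B flips between time 100 and time 200. At time 100 the rule also prefers segment A even though B has the smaller utilization.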

From the discussion above, it should be apparent that 
there is a need for an information storage system that
efficiently manages information storage and performs gar- 
bage collection. The present invention fulfills this need. 

SUMMARY OF THE INVENTION 

The present invention manages an information storage
system of a computer to provide a system in which target 
segments are selected for garbage collection only if their age 
in the information storage system exceeds an age threshold 
value and, once past the age threshold, in the order of least 
utilized segments first. The system determines the age of a
segment by determining the amount of time a segment has 
been located in direct access storage devices (DASD) of the 
information storage system and considers a segment for 
garbage collection only after the segment has been located 
in DASD for the selected age threshold value. From the set
of candidate segments, the system chooses one or more for
garbage collection in the order in which they will yield the
most free space. The free space yield may be
determined by utilization data. In this way, efficiency of 
garbage collection is increased with minimal overhead for
the information storage system. 
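
A minimal sketch of the selection rule summarized above follows: segments become candidates only once their time in DASD exceeds the age threshold, and candidates are then taken in order of least utilization. All names (`Segment`, `destaged_at`, `select_targets`) are hypothetical, and the plain sort is an assumption used for brevity rather than the mechanism of the preferred embodiment.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    live_tracks: int    # L, number of live tracks in the segment
    capacity: int       # C, segment capacity in tracks
    destaged_at: float  # assumed time stamp of when the segment was written to DASD

    @property
    def utilization(self) -> float:
        return self.live_tracks / self.capacity


def select_targets(segments, now: float, age_threshold: float, count: int):
    """Pick up to `count` garbage collection targets.

    Step 1: keep only segments whose age in DASD exceeds the threshold.
    Step 2: among those candidates, take the least utilized first.
    Utilization is examined only for segments that passed step 1.
    """
    candidates = [s for s in segments if now - s.destaged_at > age_threshold]
    candidates.sort(key=lambda s: s.utilization)
    return candidates[:count]


if __name__ == "__main__":
    segs = [Segment("A", 2, 10, destaged_at=95.0),  # emptiest, but too young
            Segment("B", 5, 10, destaged_at=50.0),
            Segment("C", 3, 10, destaged_at=60.0),
            Segment("D", 9, 10, destaged_at=0.0)]
    picked = select_targets(segs, now=100.0, age_threshold=30.0, count=2)
    print([s.name for s in picked])
```

In this example segment A is the least utilized overall but is skipped because it has not yet reached the age threshold.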

An information storage system constructed in accordance 
with the invention performs better than either the greedy 
algorithm or the cost-benefit algorithm, for the case where
performance is measured by the average amount of free
space produced per garbage-collected segment. In addition, 
the age threshold decision process of the invention can be 
implemented at less cost than the cost-benefit algorithm.
Moreover, a system constructed in accordance with the 
invention can be scaled so that the ordering of segments
according to desirability for garbage collection is maintained 
regardless of the size of the system. 

Other features and advantages of the present invention 
should be apparent from the following description of the 
preferred embodiment, which illustrates, by way of
example, the principles of the invention. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a representation of a computer system
constructed in accordance with the present invention.

FIG. 2 is a block diagram representation of the garbage 
collection process performed by the computer system illus- 
trated in FIG. 1. 

FIG. 3 is a flow diagram representation of the LSA 
management operations performed by the computer system
illustrated in FIG. 1. 

FIG. 4 is a block diagram representation of a bucket 
process performed by the computer system illustrated in 
FIG. 1. 

FIG. 5 is a representation of the GCU vs. normalized age
threshold value with a max-empty value of one and a
volatility specified by h=0.1 and p=0.9.
FIG. 6 is a representation of the GCU vs. normalized age
threshold value with a max-empty value of one and a 
volatility specified by h=0.1 and p=0.7. 

FIG. 7 is a representation of the GCU vs. normalized age 
threshold value with an m value such that m=0.05 and h=0.1 
and p=0.9. 

FIG. 8 is a representation of the GCU vs. normalized age 
threshold value with an m value such that m=0.01 and h=0.1 
and p=0.9. 

DESCRIPTION OF THE PREFERRED 
EMBODIMENT 

FIG. 1 shows a preferred embodiment of a computer 
system 100 constructed in accordance with the present 
invention. The system 100 includes a processor 102 or host 
computer that communicates with an external information 
storage system 104 having N+1 direct access storage devices
(DASD) in which information is maintained as a log- 
structured array (LSA). In FIG. 1, an array 106 comprising 
four DASDs 106a, 106b, 106c, and 106d is shown for
illustration, but it should be understood that the DASD array 
may include a greater or lesser number of DASD. A control 
unit 108 controls the storage of information so that the 
DASD array 106 is maintained as an LSA. Thus, the DASD 
recording area is divided into multiple segment-column 
areas and all like segment-columns from all the DASDs 
comprise one segment's worth of data. The control unit 108 
manages the transfer of data to and from the DASD array 
106 so that periodically it considers segments for garbage 
collection if their age in the array exceeds an age threshold 
value and selects target segments according to the least 
utilized segments first. Thus, utilization information for a 
segment is examined only if the segment is past the age 
threshold value. This reduces the processing overhead for 
the control unit 108. 

LSA OPERATIONS 

The processor 102 includes (not illustrated): one or more 
central processor units, such as a microprocessor, to execute 
programming instructions; random access memory (RAM) 
to contain application program instructions, system program 
instructions, and data; and an input/output controller to 
respond to read and write requests from executing applica- 
tions. The processor 102 may be coupled to local DASD (not 
illustrated) in addition to being coupled to the LSA 104. 
Typically, an application program executing in the processor 
102 may generate a request to read or write data, which 
causes the operating system of the processor to issue a read 
or write request, respectively, to the LSA control unit 108. 

When the processor 102 issues a read or write request, the 
request is sent from the processor to the control unit 108 
over a data bus 110 and is received in the control unit by a 
controller 112. In response, the controller produces control 
signals and provides them over a controller data path 114 to 
an LSA directory 116 and thereby determines where in the 
LSA the data is located, either in a non-volatile LSA data 
cache 118 or in the DASD 106. The LSA controller 112 
comprises one or more microprocessors with sufficient 
RAM to store programming instructions for interpreting 
read and write requests and for managing the LSA 104 in 
accordance with the present invention. 

Data is transferred between the processor 102 and the 
LSA 104 during read operations over a path including a read
data path 120, DASD access circuits 122, the LSA data 
cache 118, controller access circuits 124, the controller data 
path 114, the controller 112, and the data bus 110. Data is 
transferred during write operations over a path including the
data bus 110, the controller 112, the controller data path 114,
the controller access circuits 124, the LSA data cache 118,
the DASD access circuits 122, a segment data path 126, an
accumulating memory segment input write buffer 128, and
a DASD write path 130.

The data cache 118 permits delay of write operations on
modified data tracks to the memory segment 128 for purposes of
maintaining seek affinity. More particularly, if write
operations to adjacent tracks are received, then all modified
data in logically adjacent tracks will be moved into the
memory segment 128 at the same time so they are stored in
the same segment-column. This helps keep together tracks
that are adjacent in the data cache so they will be adjacent
when moved into the DASD array, thereby preserving seek
affinity. The advantages and operation of the data cache 118
are described in greater detail in U.S. Pat. No. 5,551,003
issued Aug. 27, 1996 and assigned to International Business
Machines Corporation.

Preferably, the LSA data cache 118 is managed as a
least-recently-used cache, so that data is queued in the
cache, with the most recently stored data at the top (or front)
of the queue. In particular, the LSA data cache 118 is
organized with clean data tracks in one LRU list and dirty
tracks in another LRU list. The clean LRU list specifies
tracks containing information wherein the data in the LSA
cache is the same as the data in the DASD array, and the
dirty LRU list specifies tracks containing modified data
wherein data is different from the data in the DASD array.

A basic operation of the storage system 104 is to write a
particular track so as to change the contents of the track. In
general, such live data tracks are first placed in the
non-volatile data cache memory 118 of the LSA control unit 108.
When the fraction of the cache occupied by modified tracks
exceeds a predetermined value, the controller 112 logically
moves a set number of modified tracks to the memory
segment 128 by assigning them there. After one segment's
worth of live tracks are moved into the memory segment, the
tracks are written into contiguous locations of the DASD
array 106. It should be understood that the operation of the
data cache 118 is transparent to the processor 102 and
therefore some operations of the storage system 104 will be
described from the perspective of the processor, without
reference to the data cache. Although the inclusion of a data
cache 118 as described above can improve the overall
performance of an LSA system, it should be understood that
the inclusion of a data cache and the details of its
implementation are not essential to the invention.

WRITE OPERATIONS, DESTAGING, &
GARBAGE COLLECTION

The smallest unit of data that can be written by the
processor 102 is called a track, and a predetermined number
of tracks comprise a segment. At any time, a track is live, or
current, in only one segment. In all other segments, the track
is outdated, also referred to as being a dead track. From the
perspective of the processor 102, a live data track is initially
stored into controller memory (such as the data cache 118 or
the input memory segment write buffer 128) comprising a
segment s0 that initially is empty. That is, the segment s0
resides in the controller memory as the segment is filled.

If a track k is being written into the segment s0 of
controller memory and if the track k was previously live in
some other DASD segment s in the DASD 106 before the
write operation, then the track k becomes dead in the
segment s and becomes live in the controller segment s0
being filled. This continues until the segment s0 in the LSA
controller memory is filled to capacity, at which time the
segment s0 is destaged, meaning that it is moved from the
memory segment buffer 128 and written to the DASD array
106. Another segment's worth of data is then filled in the
controller memory and the process repeats until the next
destage operation.

As data writing proceeds from LSA data cache memory to
DASD in this manner, the DASD storage in the LSA
becomes fragmented. That is, after several sequences of
operations, there can be many DASD segments
that are only partially filled with live tracks and otherwise
include dead tracks. This affects an LSA operating statistic
referred to as utilization.

At any time, the utilization of a segment is the fraction of
the segment containing live tracks. If a segment
contains L live tracks and if the segment capacity is C tracks,
then the utilization of the segment is given by

$$\text{Utilization} = \frac{L}{C}.$$

The writing process described immediately above will
fill up the empty segments in the DASD array
106. Therefore, a garbage collection process (described
below) is performed to create empty segments.
Garbage collection is carried out by choosing a certain
number of partially-filled target segments in DASD and
compacting the live tracks in these segments into a fewer
number of full segments, thereby creating empty segments.
For example, if garbage collection is performed on three
partially empty segments, and each has a 2/3 utilization rate,
then the live tracks can be collected and reorganized into two
full segments and one completely empty segment that is
ready to receive data from the LSA input write buffer 128.
Thus, a net increase of one empty segment is created by the
garbage collection process.

In the preferred embodiment, the target segments are
collected in a garbage collection buffer 131 for compaction
into the segment buffer 128. Alternatively, the garbage
collected segments can be compacted directly into the
segment buffer. The segment buffer 128 contains at least two
physical buffers, each of which can hold one segment of
data. One physical buffer collects newly written live tracks
that are received over the data path 126. Another separate
physical buffer collects live tracks that were taken from
garbage collected segments for the purpose of compaction;
for example, these tracks are received from the garbage
collection buffer 131. When one of these buffers is filled to
capacity, the contents of the buffer are written to an empty
segment in the DASD array. Thus, in the preferred
embodiment, newly-written tracks are placed into segments
separate from segments used for garbage-collected tracks.
The garbage collection process is typically a low priority,
background process carried out periodically by the
controller.

THE ARRAY CONTROL UNIT

As noted above, the LSA control unit 108 of the preferred
embodiment includes both a non-volatile LSA data cache
118 and a memory segment buffer 128. The memory segment
buffer contains sufficient data storage to contain at least
two segments of data [...]. The LSA data cache 118 stores
both logical tracks of data received from the processor 102
and logical tracks read from the DASD array 106.

LSA Controller Operation

The controller 112 includes microcode that emulates one
or more logical devices so that the physical nature of the
external storage system (the DASD array 106) is transparent
to the processor 102. Thus, read and write requests sent from 
the processor 102 to the storage system 104 are interpreted 
and carried out in a manner that is otherwise not apparent to 
the processor. In this way, one or more logical (virtual)
devices are mapped onto the actual DASDs of the array 106 
by the array control unit 108. 

Because the controller 112 maintains the stored data as an 
LSA, one or more logical tracks can be stored entirely within 
a segment-column of one of the DASDs 106a, 106b, 106c,
106d. Over time, the location of a logical track in the DASD 
array can change. The LSA directory 116 has an entry for 
each logical track, to indicate the current DASD location of 
each logical track. Each LSA directory entry for a logical 
track includes the logical track number, the actual DASD
drive number and segment-column number within the 
DASD, the starting sector within the column at which the 
logical track starts, and the length of the logical track in 
sectors. 
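
The directory entry fields listed above can be sketched as a small record type keyed by logical track number. The class and method names are hypothetical and the in-memory dictionary is an assumption for illustration; the text specifies only which fields an entry holds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectoryEntry:
    logical_track: int    # logical track number
    drive: int            # actual DASD drive number
    segment_column: int   # segment-column number within that DASD
    start_sector: int     # starting sector within the column
    length_sectors: int   # length of the logical track in sectors


class LsaDirectory:
    """One entry per logical track, always pointing at its current location."""

    def __init__(self):
        self._entries = {}

    def update(self, entry: DirectoryEntry) -> None:
        # A logical track moves whenever it is rewritten, so the new
        # entry simply replaces the old one.
        self._entries[entry.logical_track] = entry

    def lookup(self, logical_track: int) -> DirectoryEntry:
        return self._entries[logical_track]


if __name__ == "__main__":
    directory = LsaDirectory()
    directory.update(DirectoryEntry(7, drive=2, segment_column=40,
                                    start_sector=16, length_sectors=8))
    # Track 7 is rewritten later and lands somewhere else in the array.
    directory.update(DirectoryEntry(7, drive=0, segment_column=93,
                                    start_sector=0, length_sectors=8))
    print(directory.lookup(7))
```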

When the controller 112 receives a read request for data
in a logical track, it determines the logical track in which the 
data is stored, examines the LSA directory 116, and deter- 
mines the DASD number, starting sector, and length in 
sectors to which the logical track is currently mapped. The 
controller then reads the relevant sectors from the
corresponding DASD unit of the N+1 units in the array 106.
When it receives a write request, the controller 112 first 
accumulates the data to be written in the memory segment 
buffer 128, which can store N+1 segment-columns to form
one complete segment. Each segment comprises N segment-columns
of data (user information) and one segment-column
of parity data. When the memory segment is full, a parity 
segment-column is generated by performing an exclusive- 
OR operation over all of the N data segment-columns in the 
segment. Next, the N+1 segment-columns are written to an
empty segment in the DASD array 106, and the LSA 
directory entries for all logical tracks that were written to 
DASD from the memory segment are updated to reflect the 
new DASD locations. 
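
The parity step can be illustrated with a short sketch. It assumes each segment-column is a byte string of equal length; the function names are hypothetical. The rebuild helper relies on the general property of exclusive-OR parity that any one lost column equals the XOR of the remaining columns, which is offered here as an illustration rather than as something the text describes.

```python
from functools import reduce


def parity_column(data_columns: list[bytes]) -> bytes:
    """XOR the N equal-length data segment-columns byte by byte."""
    if len({len(c) for c in data_columns}) != 1:
        raise ValueError("all segment-columns must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column_bytes)
                 for column_bytes in zip(*data_columns))


def rebuild_column(surviving: list[bytes]) -> bytes:
    """Recover one lost column by XOR-ing the other N columns (data and parity)."""
    return parity_column(surviving)
```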

Because of the updating and deletion of logical tracks, 
gaps in the DASD segments occur. Therefore, to ensure that 
an empty segment is always available in the DASD array 
106, the array control unit 108 periodically performs the 
garbage collection process on segments in the LSA. In the
garbage collection process generally, a subset of the DASD 
array segments is selected for garbage collection and DASD 
tracks in the segments are read and moved into the part of 
the memory segment buffer used to collect live tracks from 
the garbage collection process. These "live" logical tracks
are rewritten back to DASD when the buffer is full. As a 
result, space is freed on the DASDs. The freed space is 
returned to a pool of empty segments that are available for 
data storage. 
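
As a rough illustration of the compaction just described, the sketch below gathers the live tracks of the selected segments into a buffer and emits a full segment each time the buffer reaches capacity. The function name compact and the list-based representation are assumptions made for the example.

```python
def compact(selected, capacity):
    """Collect live tracks from the selected segments.

    selected: one list of live track names per selected segment.
    Returns (written, pending): the full segments rewritten to DASD and
    the live tracks still waiting in the buffer.
    """
    buffer, written = [], []
    for live_tracks in selected:
        for track in live_tracks:
            buffer.append(track)
            if len(buffer) == capacity:   # buffer full: rewrite to DASD
                written.append(buffer)
                buffer = []
    return written, buffer
```

In this sketch, collecting three segments that hold six live tracks in total, with a capacity of four tracks per segment, writes one full segment and leaves two tracks buffered, so two segments are freed on balance.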

GARBAGE COLLECTION ACCORDING TO
THE INVENTION 

The invention provides a more efficient way of controlling
and implementing the garbage collection process. In accordance
with the invention, segments must wait in the DASD
array for a minimum time equal to an age threshold before
they can be considered for garbage collection. Moreover, of
the segments that pass the age threshold value and become
candidates for garbage collection, only those segments that
will yield the greatest amount of free space are selected. As
noted above, garbage collection in accordance with the
invention is predicated on the idea that segments recently
filled by write operations should wait an age threshold
amount of time before they are allowed to become candidates
for garbage collection, to give the storage system a
reasonable amount of time to rewrite the data before the
segment is pulled out of the DASD array for garbage
collection. That is, waiting for the age threshold is a recognition
that segments in the DASD array for that time are
unlikely to get significantly more empty due to rewrite
operations.

Segments are candidates for garbage collection only after 
their age passes the age threshold value. The age of a 
segment is determined with a time processor destage clock 
132 that generates a timestamp value for a segment when 
that segment is filled in the memory segment buffer 128 and 
is to be written into the DASD array 106. In particular, the 
time processor destage clock is initially set to zero. When a 
segment is filled by track writing operations from the 
processor 102 (a TW-filled segment), the timestamp asso- 
ciated with that segment is set to the current value of the 
destage clock, and the destage clock is then incremented by 
one. The timestamp value, for example, can be maintained 
in the LSA directory 116. When a segment is filled by live 
tracks taken from garbage-collected segments (a GC-filled 
segment), the timestamp associated with that segment is set 
to the largest timestamp of any segment that contributed a 
track to it during the garbage collection. In the preferred 
embodiment, the destage clock is not incremented when a 
GC-filled segment is written to the DASD array. 
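
The timestamp rules can be sketched as follows. The class and method names are hypothetical, and the timestamps are kept in a plain dictionary rather than in the LSA directory.

```python
class DestageClock:
    def __init__(self):
        self.value = 0        # the destage clock is initially set to zero
        self.timestamp = {}   # segment name -> timestamp

    def tw_filled(self, segment):
        """A TW-filled segment takes the current clock value, then the clock ticks."""
        self.timestamp[segment] = self.value
        self.value += 1

    def gc_filled(self, segment, contributors):
        """A GC-filled segment takes the largest timestamp among its contributors.

        The clock is not incremented, as in the preferred embodiment.
        """
        self.timestamp[segment] = max(self.timestamp[c] for c in contributors)
```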

The age of a segment is defined as the difference between 
the current value of the destage clock and the timestamp of 
the segment itself. Therefore, a GC-filled segment initially 
has an age equal to the age of the youngest segment that 
contributed tracks to it. For example, if the destage clock 
value is currently set to ten, and if the threshold value is set 
to four, then a segment must have a timestamp value of at 
most (10-4) or six to be old enough for garbage collection 
consideration. 
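
The worked example can be checked with a few lines of code. This sketch assumes, following the example above, that a segment qualifies once its age is at least the threshold; the function names are hypothetical.

```python
def age(clock, timestamp):
    """Age is the current destage clock value minus the segment timestamp."""
    return clock - timestamp


def is_candidate(clock, timestamp, age_threshold):
    """True when the segment is old enough to be considered for collection."""
    return age(clock, timestamp) >= age_threshold
```

With a clock value of ten and a threshold of four, a segment stamped six has age four and qualifies, while a segment stamped seven has age three and does not.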

In the preferred embodiment, garbage collection in accor- 
dance with the present invention depends on a parameter, the 
age threshold value. 

The garbage collection process in accordance with the 
invention will be best understood by considering the infor- 
mation storage area in the LSA 104 as a collection of 
segments whose configuration changes from filled to empty 
and back again. FIG. 2 illustrates this characterization. 

The storage area in the DASD array 106 is organized into 
segments. These segments may be completely empty 
(represented in FIG. 2 as a pool or queue of empty segments 
202) or may contain a mixture of live data and dead data 
tracks (represented in FIG. 2 as the pool of non-empty 
segments 204). As noted above, track write operations are 
used to completely fill one segment's worth of data in the 
memory segment buffer, whose contents are then transferred 
to the next available empty DASD segment. This processing 
is represented in FIG. 2 by an empty DASD segment 
receiving one segment's worth of track write operations 206 
to become a track-write-filled (TW-filled) segment 208. The
TW-filled segment thereby joins the ranks of the non-empty 
segments. Garbage collection processing 210 therefore is 
understood as collecting partially-filled non-empty segments 
204 and creating both completely filled segments 
(designated by the GC-filled segments path 212) and seg- 
ments completely emptied (designated by the empty seg- 
ments path 214). 

Segment Age 

Once the age of a segment s passes the age threshold
value, the segment will pass the age threshold at all times in
the future, until the segment is selected during the garbage
collection process. When a GC-filled segment s is filled with
live tracks, those tracks were selected from DASD during
garbage collection and therefore the segment immediately
passes the age threshold value, because the segments that
contributed tracks to the segment must have passed the age
threshold before they could have been selected. In effect,
only the TW-filled segments must wait to pass the age
threshold value before selection. An alternative, which gives
better performance in certain cases, is also to require the
GC-filled segments to wait to pass the age threshold value
before selection. In the pseudo-code description to follow,
one or the other of these alternatives is chosen by setting a
flag.

As noted above, the array controller 112 selects target
segments for garbage collection only after the segments pass
the age threshold value, and selects segments in the order of
smallest utilization rate. Utilization u was defined above to
be the fraction of live space in the segment, so that (1-u) is
defined to be the fraction of free space in the segment, also
called the "dead" fraction. If two target segments have the
same utilization, then the controller 112 selects the oldest
segment for garbage collection. The rationale for making
such a selection in the event of a tie for utilization is that
older segments tend to have fewer "hot" tracks (tracks
accessed repeatedly) than younger segments, and therefore
have less potential for decreasing utilization in the future. If
two segments eligible for garbage collection have the same
utilization, then the preferred embodiment first selects the
oldest segment for garbage collection.

PROCESSING WITHIN THE LSA

FIG. 3 is a flow diagram that illustrates the processing
steps performed by the LSA controller 112 in managing the
LSA 104.

The flow diagram box numbered 302 indicates that LSA
operations begin with the setting of segment age when a
segment is written from the LSA memory segment buffer
128 to the DASD array 106. Next, the garbage collection
process is initiated at the flow diagram box numbered 304.
Those skilled in the art will appreciate that different
methodologies exist for determining when garbage collection
should be performed, such as the percentage of empty
segments in the LSA. These methodologies do not form a
part of this invention. Because the LSA controller 112
considers a segment for garbage collection only if its age is
greater than the age threshold value, the next processing step
is to check segment age, which is represented by the flow
diagram box numbered 306. Each segment that is determined
to be a candidate for garbage collection is preferably
designated in some way, such as by setting a flag in the LSA
directory entry for that segment. Next, the utilization of each
segment that is older than the age threshold is determined,
as represented by the flow diagram box numbered 308.

After all candidate segments are determined, the LSA
controller 112 selects the garbage collection target segments
in the order of smallest utilization rate, as illustrated by the
box numbered 310. That is, the segments with smaller
utilization rates will be consolidated in garbage collection
before segments with greater utilization rates. Other
processing may be encompassed within the box 310 processing.
For example, the LSA controller will select the older of two
segments if any two segments have equal utilization rates.

The next step of the garbage collection process is represented
by the flow diagram box numbered 312, which shows
that the LSA controller 112 moves segments into the segment
buffer for compaction. Lastly, shown by the box
numbered 314, the LSA controller 112 moves the GC-filled
segments from the memory segment buffer 128 to the DASD
array. The LSA processing continues with setting segment
age (box 302) as write operations are conducted by the LSA
controller. It should be understood that the flow diagram in
FIG. 3 is shown as a sequential process for illustration, and
the functions performed by different boxes might be
performed concurrently.

AGE-QUEUE BUCKETS

Maintaining a list of qualified segments ordered by their
utilization could require excessive operating overhead,
because a segment will change position in the list every time
its utilization changes. An alternative, which gives a more
efficient implementation, is to group segments into "buckets,"
where each bucket covers a range of utilization values.
FIG. 4 illustrates information flow in an implementation of
the present invention wherein segments eligible for garbage
collection are grouped into a collection of utilization intervals
or buckets that are organized as first-in, first-out
(FIFO) queues. It was noted above that the LSA controller
112 selects segments that are past the age threshold according
to the lowest utilization first. FIG. 4 illustrates that
the LSA controller can perform step 310 of the FIG. 3 flow
diagram by grouping eligible segments into a series of, for
example, ten buckets 402, each bucket corresponding to a
one-tenth range of utilization. Thus one bucket 402a will be
designated for segments having utilization rates from zero to
0.1, the next bucket 402b will be designated for segments with
utilization rates greater than 0.1 and less than or equal to 0.2,
the next bucket will be for rates greater than 0.2 and less than
or equal to 0.3, and so forth, to a bucket 402c for rates u where
0.9 < u <= 1.

It should be understood that the queues 402 may be
implemented as LSA controller memory. That is, the buckets
402 may be included in the information of the LSA directory
116, so that the data comprising any selected segment is not
physically moved even as the segment is "moved" within its
respective queue or is "moved" to a different bucket as its
utilization changes. Similarly, the waiting list 404 illustrated
in FIG. 4 is a queue in which segments are grouped as they
await selection for garbage collection. Whenever the segment
at the head of the waiting list passes the age threshold,
it is removed from the waiting list and enters the tail of the
appropriate bucket determined by its utilization. The waiting
list may be implemented as a queue of segment identifiers in
controller memory. Thus, segments do not need to be moved
physically to change their "location" in a bucket; rather, a
segment identifier or name can be moved within the respective
buckets.

For each of the queue buckets 402, each of the respective
member segments will have passed the age threshold value
and will have the utilization corresponding to the bucket in
which they have been grouped. Segments having a utilization
of zero are a special case and are not placed in any
bucket. If a candidate segment is to be selected for garbage
collection, then the segment at the head of the lowest-numbered
(lowest utilization range bucket) non-empty
bucket is used first. Such segments are compacted into the
garbage collection buffer 131 (FIG. 4 and FIG. 1). Segments
are taken from the head of the waiting list if all buckets are
empty, to avoid selection failure if all buckets are empty. An
exemplary number of queue buckets is ten; a much smaller
number will not sufficiently pick segments with smaller
utilization values, and a much larger number (such as one
hundred) might require operating overhead such that it will
not be sufficiently efficient.
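
The bucket arrangement can be sketched with one FIFO queue per utilization interval plus a waiting list. This is a minimal sketch under stated assumptions: utilization is taken as live tracks divided by segment capacity, bucket boundaries follow the prose above (each interval includes its upper limit), integer arithmetic is used to avoid floating-point boundary errors, and all names are hypothetical.

```python
from collections import deque

NUM_BUCKETS = 10  # exemplary number of buckets


def bucket_index(live_tracks, capacity, n=NUM_BUCKETS):
    """Map a utilization of live_tracks/capacity to a bucket number 0..n-1."""
    if not 0 < live_tracks <= capacity:
        raise ValueError("zero-utilization segments are not placed in any bucket")
    return (live_tracks * n + capacity - 1) // capacity - 1


class AgeQueueBuckets:
    def __init__(self, n=NUM_BUCKETS):
        self.buckets = [deque() for _ in range(n)]  # FIFO queue per interval
        self.waiting = deque()                      # waiting list of segment names

    def admit(self, segment, live_tracks, capacity):
        """Enter a segment that passed the age threshold at the tail of its bucket."""
        index = bucket_index(live_tracks, capacity, len(self.buckets))
        self.buckets[index].append(segment)

    def select(self):
        """Take the head of the lowest-numbered non-empty bucket.

        Falls back to the head of the waiting list when all buckets are empty.
        """
        for queue in self.buckets:
            if queue:
                return queue.popleft()
        return self.waiting.popleft() if self.waiting else None
```

Only segment names move between queues in this sketch, which mirrors the point that segment data is never physically moved when a segment changes bucket.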
Garbage collection proceeds whenever the value of a
garbage collection flag (GC-flag) is set to "on". The process 
that sets the GC-flag is external to the invention. For 
example, the process might set GC-flag to "on" when the 
number of empty segments falls below a certain threshold, 
and set GC-flag to "off" when the number of empty segments
reaches another (larger) threshold. The operation also
depends on the value of a GC-wait-flag, which determines 
whether both GC-filled and TW-filled segments must wait to 
pass the age threshold, or whether only TW-filled segments 
must wait. That is, if the controller 112 (FIG. 2) detects that 
the GC-wait-flag has a value of "true", then it lets GC-filled 
segments enter the waiting list, just as do TW-filled seg- 
ments. If the GC-wait-flag has a value of "false", then 
GC-filled segments are not forced to wait, but the controller 
lets them become available for selection as soon as their 
utilization rate drops below one. It has been found that 
system operation is improved if the GC-wait-flag is set to 
"true". 

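The example policy for the GC-flag amounts to a simple hysteresis between two thresholds. The sketch below is one possible reading; the function name and the parameter names low and high are hypothetical, and the process that sets the flag remains external to the invention.

```python
def update_gc_flag(gc_on, empty_segments, low, high):
    """Return the new GC-flag value given the current count of empty segments.

    The flag turns on when the count falls below `low`, turns off when the
    count reaches `high` (with high > low), and otherwise keeps its value.
    """
    if empty_segments < low:
        return True
    if empty_segments >= high:
        return False
    return gc_on
```
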
PSEUDO-CODE DESCRIPTION OF OPERATION 

In accordance with the FIG. 4 implementation of the data 
storage system, the LSA controller 112 (see FIG. 2), per- 
forms operations that can be used by control routines to 
move segments around the queue buckets. The controller 
operation will be described in terms of function calls with 
the following pseudo-code. The function calls used by the 
controller will include those listed below in Table 1: 

TABLE 1

enqueue(s,q)     a function that enters a segment s into a particular
                 queue q.

dequeue(q)       a function that returns the segment name at the head of
                 a queue q, and removes the named segment from the queue
                 (if the queue q is empty, then this operation returns an
                 "empty" value).

remove(s)        a function that removes a segment name s from whatever
                 queue in which the segment is grouped, even if the named
                 segment is not at the head of its respective queue (if
                 the named segment s is not in any of the queues, then
                 this operation has no effect).

inspect-TS(q)    a function that returns a timestamp value for a segment
                 s where the segment s is at the head of the queue q, or
                 this operation returns an "empty" value if the queue q
                 is empty.

queue(s)         a function that returns the queue in which the named
                 segment s is grouped (if the segment s is not in any
                 queue, this operation returns the value "none").

util(s)          a function that denotes the current utilization of the
                 segment named s.

In view of the description above and the function calls from 
Table 1, the following pseudo-code of Table 2 describes the 
system operation for a given age threshold (AT) value and a 
given value of GC-wait-flag (comments are enclosed in 
diagonal slashes): 

TABLE 2

START

1. Set Destage clock ← 0.
2. TS(s) ← 0 for 1 <= s <= S        /S is the number of segments in LSA/
3. best-queue ← b                   /b is waiting queue/

LOOP: Perform steps 4, 5, 6, and 7 repeatedly and concurrently:

4. If a TW-filled segment s is written to DASD, then
   a. TS(s) ← Destage clock.
   b. Destage clock ← Destage clock + 1.
   c. enqueue(s, b)                 /put the next segment into b,/
                                    /the waiting list queue/
   d. If inspect-TS(b) >= Destage clock - AT or
      if inspect-TS(b) = "empty" then stop.
   e. s ← dequeue(b).               /put the next segment in the/
                                    /waiting list into the proper bucket/
   f. q ← integer[b x util(s)]      /get the bucket into which/
                                    /the segment will go/
   g. If util(s) = 1, then q ← b - 1.
   h. enqueue(s, q)                 /put the segment into the bucket/
   i. best-queue ← minimum[best-queue, q]
                                    /find the lowest numbered/
                                    /non-empty queue or bucket/
   j. Go to step 4d.

5. If a GC-filled segment s is written to DASD:
      TS(s) ← 0.
      If GC-wait-flag = "true", then enqueue(s, b),
      else [enqueue(s, b - 1)].     /if GC-filled segments/
                                    /should wait, then set TS = 0 and/
                                    /put the segment in the waiting list/

6. If utilization changes for a segment s:
   a. If util(s) = 0 then remove(s) and stop.
                                    /process empty segment/
   b. If queue(s) = b then stop.    /segment is in waiting list;/
                                    /wait for action/
   c. q ← integer[b x util(s)].
   d. If q = queue(s) then stop.
   e. remove(s).
   f. enqueue(s, q).
   g. best-queue ← minimum[best-queue, q].
                                    /use lowest numbered/
                                    /non-empty queue/

7. If GC-flag = "on" then:
   a. s ← dequeue(best-queue).
   b. If s is not "empty" then go to 7e.
   c. best-queue ← best-queue + 1.
   d. go to 7a.
   e. If util(s) < 1, then go to 7h.
   f. enqueue(s, b).                /move util = 1 segments to back/
                                    /of waiting queue/
   g. go to 7a.
   h. return s.                     /return the segment name/



The first three steps in Table 2 above are part of the storage 
system initialization, such as might be performed during a 
power-up stage. Steps 4, 5, 6, and 7 are repeatedly and 
concurrently performed thereafter. 
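
The pseudo-code of Table 2 can be approximated in a runnable form. The sketch below is an interpretation, not the controller microcode: the class and method names are hypothetical, utilization is supplied by the caller rather than computed from tracks, and two safeguards are added that Table 2 does not state (the best-queue pointer is lowered when a GC-filled segment goes straight to the last bucket, and the selection loop is bounded so that it ends when only full segments remain).

```python
from collections import deque


class SegmentSelector:
    """Queues 0..b-1 are utilization buckets; queue b is the waiting list."""

    def __init__(self, num_buckets=10, age_threshold=0, gc_wait=True):
        self.b = num_buckets
        self.at = age_threshold
        self.gc_wait = gc_wait
        self.clock = 0                    # destage clock starts at zero
        self.ts = {}                      # segment -> timestamp
        self.util = {}                    # segment -> utilization in 0..1
        self.queues = [deque() for _ in range(num_buckets + 1)]
        self.where = {}                   # segment -> index of its queue
        self.best = num_buckets           # best-queue starts at the waiting list

    def _enqueue(self, s, q):
        self.queues[q].append(s)
        self.where[s] = q

    def _remove(self, s):
        q = self.where.pop(s, None)
        if q is not None:
            self.queues[q].remove(s)

    def _bucket(self, s):
        # integer[b x util(s)], with util = 1 mapped to the last bucket
        return min(int(self.b * self.util[s]), self.b - 1)

    def tw_written(self, s, util=1.0):
        """A TW-filled segment is written to DASD."""
        self.ts[s] = self.clock
        self.clock += 1
        self.util[s] = util
        self._enqueue(s, self.b)
        waiting = self.queues[self.b]
        # Move every waiting segment that has aged past AT into its bucket.
        while waiting and self.ts[waiting[0]] < self.clock - self.at:
            old = waiting.popleft()
            q = self._bucket(old)
            self._enqueue(old, q)
            self.best = min(self.best, q)

    def gc_written(self, s, util=1.0):
        """A GC-filled segment is written to DASD."""
        self.ts[s] = 0
        self.util[s] = util
        if self.gc_wait:
            self._enqueue(s, self.b)
        else:
            self._enqueue(s, self.b - 1)
            self.best = min(self.best, self.b - 1)   # safeguard, see lead-in

    def util_changed(self, s, util):
        """The utilization of segment s changes."""
        self.util[s] = util
        if util <= 0:
            self._remove(s)               # empty segment leaves every queue
            return
        current = self.where.get(s)
        if current == self.b:
            return                        # still on the waiting list
        q = self._bucket(s)
        if q == current:
            return                        # bucket unchanged
        self._remove(s)
        self._enqueue(s, q)
        self.best = min(self.best, q)

    def select(self):
        """Return the next segment to collect, or None if none is available."""
        budget = 2 * len(self.where) + self.b + 2    # safeguard, see lead-in
        chosen = None
        while self.best <= self.b and budget > 0:
            budget -= 1
            queue = self.queues[self.best]
            if not queue:
                self.best += 1            # try the next queue
                continue
            s = queue.popleft()
            del self.where[s]
            if self.util[s] < 1:
                chosen = s
                break
            self._enqueue(s, self.b)      # full segments go back to waiting
        self.best = min(self.best, self.b)
        return chosen
```

With an age threshold of two, for example, the first segment written leaves the waiting list only when the third segment is written, and a later drop in its utilization moves it to a lower-numbered bucket from which it is selected first.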

SELECTING THE AGE THRESHOLD VALUE 

Although selection of the age threshold value will depend 
to some extent on the configuration of a particular informa- 
tion storage system, two methods will next be presented for 
selecting suitable values. 

Average Segment Utilization 

The age threshold value can be selected based on average 
segment utilization information. Such system information 
can be calculated by LSA controllers automatically, so that 
processing overhead for the invention is minimized. The 
average segment utilization is defined to be:

ASU = T/(C × S),

for a system with S segments, where each segment has a 
capacity of C tracks and there are T live tracks. The ASU 
value is typically a fraction less than 1. The age threshold 
value can then be calculated by using the relationship:

AT = F × S × (1 − ASU),

where F is a fraction between zero and one and S is the
number of segments in the system. An exemplary value for
F is one-half. Assuming the LSA controller controls the
garbage collection scheduling, and assuming it begins garbage
collection when the number of empty segments falls to
some lower threshold value and halts when the number of
empty segments increases to some upper threshold value,
then the AT calculation becomes:

AT = F × S × {1 − ASU − [(max-empty − min-empty)/(S − min-empty)]},

where max-empty and min-empty are maximum and minimum
numbers, respectively, of empty segments. Typical
values are max-empty = 50 and min-empty = 10 for S = 1000.
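
Both relationships can be evaluated with a short sketch. The function names are hypothetical, and the figures used in the note that follows (a capacity of 100 tracks per segment and 80,000 live tracks, giving ASU = 0.8) are illustrative assumptions rather than values stated for this calculation.

```python
def average_segment_utilization(live_tracks, segments, capacity):
    """ASU = T / (C x S)."""
    return live_tracks / (capacity * segments)


def age_threshold(f, segments, asu, max_empty=0, min_empty=0):
    """AT = F x S x {1 - ASU - [(max_empty - min_empty) / (S - min_empty)]}.

    With max_empty == min_empty the bracketed term vanishes and the
    expression reduces to AT = F x S x (1 - ASU).
    """
    gc_term = (max_empty - min_empty) / (segments - min_empty)
    return f * segments * (1 - asu - gc_term)
```

Under those assumptions, with F equal to one-half and S = 1000, the simple relationship gives an age threshold of about 100, and the version with max-empty = 50 and min-empty = 10 gives about 79.8.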



Dynamic Learning Method 

The dynamic learning method selects the age threshold
value based on system workload and makes use of the
garbage collection utilization (GCU) system statistic, defined
as the average utilization of segments selected for garbage
collection, averaged over a large number of segment selections.
The age threshold value is then adjusted according to
whether the current GCU is better or worse than the previously
computed GCU. Smaller values of GCU are better
than larger values. A small GCU means that the garbage
collection algorithm is selecting segments that on average
have small utilization, that is, yield a large amount of free
space.

The dynamic learning method operates according to three
parameters: (1) the sample size, which is the number of
segment selections over which the GCU is computed; (2) the
adjustment step, the amount that the age threshold is
changed at each iteration of the method; and (3) the max-AT
value, a maximum permitted value of the age threshold.
Generally, the sample size should be chosen large enough
that the sample provides an accurate value for GCU using
the value of age threshold that is in effect during the time that
the sample is taken.

Initially, the dynamic learning method begins with an age
threshold value of zero and sets a direction parameter to
"up" or positive. Next, the method measures the GCU over
a predetermined period of time sufficiently long to provide
reliable data. The age threshold value is then increased or
decreased depending on the value of the direction parameter,
positive or negative, whereupon the GCU is recalculated
over another period. If the GCU gets worse (increases) over
the recalculation period, then the direction parameter is
reversed, for example from positive to negative. The direction
otherwise is unchanged. The GCU is again calculated,
and the process repeats.

The dynamic learning algorithm can be understood in 
conjunction with the following pseudo-code method steps
of Table 3:



TABLE 3

 1. AT ← 0; Direction ← 1; Old-sum ← ∞.
 2. Sum ← 0; Count ← 0.
 3. Whenever a segment name s is chosen for garbage collection,
    or whenever util(s) decreases to 0 as a result of track writing:
 4. Sum ← Sum + util(s);
    Count ← Count + 1.
 5. If Count < sample-size, then go to Step 3.
 6. If Sum > Old-sum then Direction ← −Direction.
 7. AT ← AT + (Direction x Adjustment).
 8. If AT < zero, then {AT ← 0; Direction ← 1;
    Old-sum ← ∞; go to Step 2}.
 9. If AT > max-AT, then {AT ← max-AT; Direction ← −1;
    Old-sum ← ∞; go to Step 2}.
10. Old-sum ← Sum.
11. Go to Step 2.



Under a workload for which GCU as a function of the age 
threshold has only one local minimum, the dynamic learning 
algorithm will eventually close on a good age threshold 
value, provided that the period over which the GCU is 
calculated is sufficiently long and the amount that the age 
threshold value is increased or decreased is sufficiently 
small. 

Empirical studies are useful in determining optimal values 
for particular systems. 
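
The dynamic learning steps of Table 3 can be expressed as a small state machine that is fed one utilization value per segment selection. This is a sketch under the assumption that the caller reports each selection by calling record; the class and attribute names are hypothetical.

```python
import math


class AgeThresholdLearner:
    def __init__(self, sample_size, adjustment, max_at):
        self.sample_size = sample_size
        self.adjustment = adjustment
        self.max_at = max_at
        self.at = 0                 # age threshold starts at zero
        self.direction = 1          # initial direction is "up"
        self.old_sum = math.inf     # no previous sample yet
        self.total = 0.0            # running sum of utilizations
        self.count = 0              # selections in the current sample

    def record(self, util):
        """Record the utilization of a selected segment; return the current AT."""
        self.total += util
        self.count += 1
        if self.count < self.sample_size:
            return self.at          # sample not complete yet
        if self.total > self.old_sum:
            self.direction = -self.direction   # GCU got worse: reverse
        self.at += self.direction * self.adjustment
        if self.at < 0:
            self.at, self.direction, self.old_sum = 0, 1, math.inf
        elif self.at > self.max_at:
            self.at, self.direction, self.old_sum = self.max_at, -1, math.inf
        else:
            self.old_sum = self.total
        self.total, self.count = 0.0, 0        # start the next sample
        return self.at
```

Comparing sums rather than averages is sufficient in this sketch because every sample contains the same number of selections.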

EMPIRICAL RESULTS 

The performance of a garbage collection algorithm can be
measured by its GCU value, defined above as the average
utilization of segments selected for garbage collection.
Smaller values of GCU are better than larger values, since
small average utilization means that a large amount of free
space is being produced on average. Simulation and analysis
of a storage system show that, as the age threshold value
increases, the GCU first stays constant, then decreases
and then increases. FIG. 5 is an illustration of GCU as a
function of the normalized age threshold value. The normalized
age threshold is defined to be the age threshold
divided by the number of segments. FIG. 5 was obtained
from analysis and simulation of a system with constant
ASU=0.8, with a "hot-and-cold" model of track writing in
which a fraction h=0.1 of the tracks are written a fraction
p=0.9 of the time, and where one empty segment is produced
during each phase of garbage collection (indicated by
max-empty=1). The dotted line shows the result of mathematical
analysis, and the small circles plot data points obtained from
simulation.
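
A workload of the "hot-and-cold" kind can be generated as in the sketch below. It assumes that writes are spread uniformly within the hot set and within the cold set, which the text does not specify; the function names and the seed parameter are hypothetical.

```python
import random


def hot_cold_writes(num_tracks, h, p, count, seed=0):
    """Yield `count` track numbers to write.

    A fraction h of the tracks (the "hot" tracks, numbered from zero)
    receives a fraction p of the writes; the rest go to the cold tracks.
    """
    rng = random.Random(seed)
    hot = max(1, int(num_tracks * h))
    for _ in range(count):
        if rng.random() < p:
            yield rng.randrange(hot)               # write to a hot track
        else:
            yield rng.randrange(hot, num_tracks)   # write to a cold track


def normalized_age_threshold(age_threshold, segments):
    """Age threshold divided by the number of segments."""
    return age_threshold / segments
```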

From FIG. 5, it can be seen that for a range of age 
threshold values sufficiently near zero, the selection process 
in accordance with the invention is essentially the same as 
the greedy algorithm, because the greedy algorithm will not 
select a segment based on the smallest utilization rate until 
the age of the segment has passed the age threshold value. 
Eventually, the age threshold process will "protect" a seg- 
ment that the greedy algorithm would have selected. This 
point is evident from the FIG. 5 graph at the value for which 
GCU begins to decrease. If the age threshold value is too 
small, however, the young segments will be collected too 
soon, before they have fulfilled their potential for rapidly 
decreasing utilization. As the age threshold value continues 
to rise, eventually a point of diminishing returns is reached, 
because the age threshold process will protect too many 
low-utilization segments, with the consequence that higher 
utilization segments must be selected. 

FIG. 6 shows the GCU as a function of normalized age 
threshold value for a simulation with a less "hot" mix of 
tracks. In particular, FIG. 5 is a graph for a situation where 
h=0.1 and p=0.9, while FIG. 6 is for a situation with h=0.1 
and p=0.7; the two drawings show that the change in GCU 
as a function of age threshold value is smaller for a simu- 
lation with less "hot" tracks. 

The comparison of FIG. 5 and FIG. 6 is somewhat 
intuitive as to results, because in the case of uniform track 
choice, the GCU does not depend on the age threshold value 
unless the age threshold value is so large that an excessive 
number of low-utilization segments are kept from being 
selected because they do not pass the age threshold.

The result from FIG. 5 and FIG. 6, that GCU does not
depend greatly on the age threshold value at low degrees of
"hot" data, suggests another solution for selection of the age
threshold value. The solution is to choose an age threshold
value based on a high degree of "hot" data. If the degree of
hotness is high, the age threshold value should be close to a
true optimal value. If the degree of data hotness is low, then
the selection of age threshold value is not critical. Thus, it
should be clear from FIG. 5 and FIG. 6 that the age threshold
value should be selected at the minimum point of the
respective graphs.

FIG. 5 and FIG. 6 provided simulation data where the
system permitted at most one empty segment (max-empty=1)
after the initial segment filling process is completed. That
is, repeatedly and alternately, the LSA controller creates one
empty segment and fills the empty segment by track write
operations. That is, this situation is where a garbage collection
process and the track writing operations are conducted
in parallel and in equilibrium.

It is also possible, and more realistic, to simulate
the situation where the number of empty segments
produced during each phase of garbage collection is greater
than one and is some fraction m of the number of segments.
FIG. 7 and FIG. 8 show the analysis and simulation for this
condition. The cases of FIG. 7 and FIG. 8 differ from the
case of FIG. 5 only in that max-empty=1 in FIG. 5, whereas
in FIG. 7 m=0.05 and in FIG. 8 m=0.01.

Comparing FIG. 7 with FIG. 5, it is clear that the optimal
normalized age threshold value decreases by about the value
of m when compared to the max-empty=1 case. That is, with
max-empty=1, the optimal normalized age threshold value is
0.196, whereas with m=0.05 (FIG. 7), the optimal normalized
age threshold value is 0.145. Similarly, comparing FIG.
8 and FIG. 5, with m=0.01 (FIG. 8), the optimal normalized
age threshold value is 0.186, whereas with max-empty=1,
the optimal value is 0.196.
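
The statement that the optimum shifts by about m can be checked directly from the quoted values. The sketch below only restates that arithmetic; the names are hypothetical.

```python
BASELINE_OPTIMUM = 0.196                       # max-empty = 1 case (FIG. 5)
OPTIMUM_BY_M = {0.05: 0.145, 0.01: 0.186}      # FIG. 7 and FIG. 8


def optimum_shifts():
    """Return, for each m, how far the optimal normalized AT moved down."""
    return {m: BASELINE_OPTIMUM - opt for m, opt in OPTIMUM_BY_M.items()}
```

The shifts come out at roughly 0.051 and 0.010, each within about 0.001 of the corresponding value of m.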

ADVANTAGES OF THE INVENTION 
Thus, an information storage system selects target segments
for garbage collection only if their age in the information
storage system exceeds an age threshold value and,
once past the age threshold, in the order of least utilized
target segments first. The system determines the age of a
segment by determining the amount of time a segment has
been located in DASD of the information storage system,
and then considers a segment for garbage collection only
after the segment has been located in the DASD for the
selected age threshold value, and then finally chooses one or
more of the considered segments for garbage collection in
the order in which they will yield the most free
space. In this way, efficiency of garbage collection is
increased with minimal overhead for the information storage
system.

The present invention has been described above in terms
of presently preferred embodiments so that an understanding
of the present invention can be conveyed. There are,
however, many configurations for disk storage systems and
servo control systems not specifically described herein but
with which the present invention is applicable. The present
invention should therefore not be seen as limited to the
particular embodiments described herein, but rather, it
should be understood that the present invention has wide
applicability with respect to log-structured storage systems
generally. All modifications, variations, or equivalent
arrangements that are within the scope of the attached claims
should therefore be considered to be within the scope of the
invention.



We claim: 

1. A method for performing a garbage collection process 
in an information storage system having direct access stor- 
age units in which information segments are located, the 
method comprising the steps of: 

selecting an age threshold value; 

determining an age value for each segment that indicates 
the time that segment has been located in a direct access 
storage device and designating each segment as a 
candidate for garbage collection if the segment has an 
age value greater than the age threshold value; and 

choosing a candidate segment for garbage collection if it 
will yield a maximized amount of free space. 

2. A method as defined in claim 1, wherein the maximized 
amount of free space is the amount of empty storage space 
provided by performing garbage collection on the candidate 
segment having the lowest utilization. 

3. A method as defined in claim 1, wherein the step of 
choosing comprises selecting a candidate segment in accor- 
dance with a yield ranking. 

4. A method as defined in claim 3, wherein the relative 
yield ranking comprises a ranking of the candidate segments 
according to utilization. 

5. A method as defined in claim 4, wherein the relative 
yield ranking comprises a plurality of utilization intervals. 

6. A method as defined in claim 3, wherein the step of 
choosing further comprises selecting between two candidate 
segments having equal yield ranking by selecting the can- 
didate segment having the greater age value. 

7. A method as defined in claim 3, wherein the informa- 
tion segments comprise a plurality of information tracks, 
and each segment is assigned an age when written from a 
memory buffer into the direct access storage devices during 
a destage operation. 

8. A method as defined in claim 7, wherein the age of a 
segment filled by garbage collection is set to the age of the 
youngest segment that contributed tracks to the filled seg- 
ment. 

9. A method as defined in claim 7, wherein the age of a 
segment is the difference between a current destage clock 
value and the destage operation destage clock value of the 
segment. 
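
The selection recited in claims 1, 2, 6 and 9 can be illustrated with a minimal sketch. It is not the patented implementation: the `Segment` record, its field names, and the function names are hypothetical, and utilization is assumed to be a fraction between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical record for one filled segment (names are illustrative)."""
    seg_id: int
    destage_clock: int   # destage clock value recorded when the segment was written
    utilization: float   # assumed: fraction of the segment still holding live tracks


def segment_age(seg, current_clock):
    # Claim 9: age is the current destage clock value minus the
    # destage clock value recorded for the segment.
    return current_clock - seg.destage_clock


def choose_segment(segments, current_clock, age_threshold):
    # Claim 1: only segments older than the age threshold are candidates.
    candidates = [s for s in segments
                  if segment_age(s, current_clock) > age_threshold]
    if not candidates:
        return None
    # Claims 2 and 6: lowest utilization first; ties go to the greater age
    # (a smaller recorded destage clock value means a greater age).
    return min(candidates, key=lambda s: (s.utilization, s.destage_clock))


segments = [
    Segment(seg_id=1, destage_clock=10, utilization=0.30),
    Segment(seg_id=2, destage_clock=40, utilization=0.30),
    Segment(seg_id=3, destage_clock=95, utilization=0.05),
]
chosen = choose_segment(segments, current_clock=100, age_threshold=20)
```

In this example segment 3 has the lowest utilization but an age of only 5, so it is not yet a candidate; segments 1 and 2 tie on utilization and the older segment 1 is chosen.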

10. A method for managing storage of information seg- 
ments in a computer processing system that stores informa- 
tion in a plurality of direct access storage devices, the 
method comprising the steps of: 

setting the age of a segment filled by track writing 
operations to a current value of a destage clock; 

placing the filled segment at a tail position of a first-in, 
first-out (FIFO) queue; 

designating a segment from a head position of the FIFO 
queue as a garbage collection candidate if the age of the 
segment is greater than an age threshold value; 

ordering a plurality of designated candidate segments in 
accordance with their respective free space yield upon 
garbage collection; and 

choosing candidate segments for performing a garbage 
collection process in the order of their relative yield 
ranking such that candidate segments with lower yield 
rankings are selected before candidate segments with 
higher yield rankings. 

11. A method as defined in claim 10, wherein the relative 
yield ranking comprises a ranking of the candidate segments 
according to utilization. 

12. A method as defined in claim 11, wherein the relative 
yield ranking comprises a plurality of utilization intervals. 
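
The queue handling and interval ranking of claims 10 to 12 can be sketched as follows. This is an illustration under stated assumptions rather than the claimed implementation: the dictionary layout, the function names, and the choice of ten equal-width utilization intervals are hypothetical, the loop promotes every eligible head segment at once, and ordering within an interval by greater age follows the tie-break recited elsewhere in the claims.

```python
from collections import deque


def destage(fifo, seg_id, utilization, destage_clock):
    # A newly filled segment is stamped with the current destage clock
    # value and placed at the tail of the FIFO queue.
    fifo.append({"id": seg_id, "clock": destage_clock, "util": utilization})


def promote_candidates(fifo, current_clock, age_threshold):
    # Segments are taken from the head of the queue while their age exceeds
    # the threshold. The queue is in destage order, so the scan can stop at
    # the first segment that is still too young.
    candidates = []
    while fifo and current_clock - fifo[0]["clock"] > age_threshold:
        candidates.append(fifo.popleft())
    return candidates


def rank_by_interval(candidates, num_intervals=10):
    # Assumed ranking: equal-width utilization intervals, lowest interval
    # first; within an interval the older segment (smaller clock) comes first.
    def interval(seg):
        return min(int(seg["util"] * num_intervals), num_intervals - 1)
    return sorted(candidates, key=lambda s: (interval(s), s["clock"]))


fifo = deque()
destage(fifo, "A", 0.25, 10)
destage(fifo, "B", 0.05, 30)
destage(fifo, "C", 0.22, 50)
destage(fifo, "D", 0.01, 90)

candidates = promote_candidates(fifo, current_clock=100, age_threshold=40)
ordered = rank_by_interval(candidates)
order_ids = [s["id"] for s in ordered]
remaining_ids = [s["id"] for s in fifo]
```

Segment D stays in the queue despite its very low utilization because its age (10) has not passed the threshold. Among the promoted segments, B falls in the lowest interval and is collected first, followed by A and C, which share an interval and are ordered by age.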
13. A method as defined in claim 10, wherein the step of
choosing further comprises selecting between two candidate
segments having equal yield ranking by selecting the
candidate segment having the greater age value.

14. A method as defined in claim 10, wherein the
information segments comprise a plurality of information tracks,
and each segment is assigned an age when written from a
memory buffer into the direct access storage devices during
a destage operation.

15. A method as defined in claim 14, wherein the age of
a segment filled by garbage collection is set to the age of the
youngest segment that contributed tracks to the filled
segment.

16. A method as defined in claim 14, wherein the age of
a segment is the difference between a current destage clock
value and the destage operation destage clock value of the
segment.

17. A method for performing a garbage collection process
in a computer processing system that stores information
segments, the method comprising the steps of:

setting an age threshold value to an initial value of zero;

selecting a garbage collection direction to an up value;

determining an initial garbage collection utilization
(GCU) measurement value over a predetermined
amount of time;

determining an initial age threshold value by performing
the steps of:

adjusting the age threshold value by increasing the age
threshold value if the garbage collection direction
has an up value, and decreasing the age threshold
value if the garbage collection direction has a down
value,

calculating the GCU value over the time since the last
determined GCU value, and

selecting the garbage collection direction to be the
opposite of its current value if the calculated GCU
value is worse than the last determined GCU value
and maintaining the garbage collection direction at
its current value if the calculated GCU value
otherwise; and

responding to a garbage collection command by performing
garbage collection on partially-filled segments of
the computer processing system according to the steps
of:

determining an age value for each segment that indicates
the time that segment has been located in a
direct access storage device and designating each
segment as a candidate for garbage collection if the
segment has an age value greater than the age
threshold value; and

choosing a candidate segment for garbage collection if it
will yield a maximized amount of free space.

18. A method as defined in claim 17, wherein the maximized
amount of free space is the amount of empty storage
space provided by performing garbage collection on the
candidate segment having the lowest utilization.

19. A method as defined in claim 17, wherein the step of
choosing a candidate segment comprises selecting a
candidate segment in accordance with a yield ranking.

20. A method as defined in claim 19, wherein the relative
yield ranking comprises a ranking of the candidate segments
according to utilization.

21. A method as defined in claim 20, wherein the relative
yield ranking comprises a plurality of utilization intervals.

22. A method as defined in claim 19, wherein the step of
choosing a candidate segment further comprises selecting
between two candidate segments having equal yield ranking
by selecting the candidate segment having the greater age
value.

23. A method as defined in claim 19, wherein the
information segments comprise a plurality of information tracks,
and each segment is assigned an age when written from a
memory buffer into the direct access storage devices during
a destage operation.

24. A method as defined in claim 23, wherein the age of
a segment filled by garbage collection is set to the age of the
youngest segment that contributed tracks to the filled
segment.

25. A method as defined in claim 23, wherein the age of
a segment is the difference between a current destage clock
value and the destage operation destage clock value of the
segment.

26. A method for determining when a filled segment in a
log-structured file information storage system should be
subjected to a garbage collection process, the method
comprising the steps of:

calculating an age threshold value defined by
F × S × {1 − ASU − [(max-empty − min-empty)/(S − min-empty)]},
where: 0 < F <= 1, ASU = average segment utilization,
max-empty is a maximum number of empty segments,
min-empty is a minimum number of empty segments,
and S is the number of segments in the
storage system; and

responding to a garbage collection command by performing
garbage collection on partially-filled segments of
the computer processing system according to the steps
of:

determining an age value for each segment that indicates
the time that segment has been located in a
direct access storage device and designating each
segment as a candidate for garbage collection if the
segment has an age value greater than the age
threshold value; and

choosing a candidate segment for garbage collection if
it will yield a maximized amount of free space.

27. A method as defined in claim 26, wherein the maximized
amount of free space is the amount of empty storage
space provided by performing garbage collection on the
candidate segment having the lowest utilization.

28. A method as defined in claim 26, wherein the step of
choosing a candidate segment comprises selecting a
candidate segment in accordance with a yield ranking.

29. A method as defined in claim 28, wherein the relative
yield ranking comprises a ranking of the candidate segments
according to utilization.

30. A method as defined in claim 29, wherein the relative
yield ranking comprises a plurality of utilization intervals.

31. A method as defined in claim 28, wherein the step of
choosing a candidate segment further comprises selecting
between two candidate segments having equal yield ranking
by selecting the candidate segment having the greater age
value.

32. A method as defined in claim 28, wherein the
information segments comprise a plurality of information tracks,
and each segment is assigned an age when written from a
memory buffer into the direct access storage devices during
a destage operation.

33. A method as defined in claim 32, wherein the age of
a segment filled by garbage collection is set to the age of the
youngest segment that contributed tracks to the filled
segment.

34. A method as defined in claim 32, wherein the age of
a segment is the difference between a current destage clock
value and the destage operation destage clock value of the
segment.

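
The two ways of setting the age threshold recited in claims 17 and 26 can be sketched as follows. This is an illustrative sketch only. The function names, the step size, the clamp at zero, and the toy GCU model are hypothetical; the claims do not define what makes a GCU value "worse", so the sketch assumes that a higher GCU is worse.

```python
UP, DOWN = "up", "down"


def formula_threshold(f, s, asu, max_empty, min_empty):
    # Claim 26: F x S x {1 - ASU - [(max_empty - min_empty) / (S - min_empty)]}
    return f * s * (1.0 - asu - (max_empty - min_empty) / (s - min_empty))


def adapt_step(threshold, direction, last_gcu, measure_gcu, step=1):
    # Claim 17: move the threshold in the current direction
    # (the step size and the clamp at zero are assumptions).
    if direction == UP:
        threshold += step
    else:
        threshold = max(0, threshold - step)
    # Measure GCU over the period since the last measurement.
    gcu = measure_gcu(threshold)
    # Assumption: a higher GCU is "worse"; reverse direction only then.
    if gcu > last_gcu:
        direction = DOWN if direction == UP else UP
    return threshold, direction, gcu


def toy_gcu(threshold):
    # Hypothetical workload: GCU (in percent) is lowest at a threshold of 3.
    return 20 + 10 * abs(threshold - 3)


# Start at a threshold of zero with the direction set to up, as in claim 17.
threshold, direction = 0, UP
gcu = toy_gcu(threshold)
history = []
for _ in range(6):
    threshold, direction, gcu = adapt_step(threshold, direction, gcu, toy_gcu)
    history.append((threshold, direction))

static_threshold = formula_threshold(f=1.0, s=110, asu=0.5,
                                     max_empty=35, min_empty=10)
```

With the toy model the adaptive threshold climbs to 3, overshoots to 4, reverses, and then oscillates around the value that gives the lowest GCU. The closed-form call evaluates to 27.5 for the sample inputs.
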
35. An information storage system comprising:

a central processing unit;

a plurality of direct access storage devices (DASD) in
which segments of information are stored for use by the
central processing unit; and

an information storage system controller that determines
the DASD locations in which the information segments
will be stored, wherein the controller manager periodically
performs a garbage collection process for forming
empty segments by performing the steps of selecting an
age threshold value,

determining an age value for each segment that indicates
the time that segment has been located in a
direct access storage device and designating each
segment as a candidate for garbage collection if the
segment has an age value greater than the age
threshold value, and

choosing a candidate segment for garbage collection if
it will yield a maximized amount of free space.

36. A system as defined in claim 35, wherein the maximized
amount of free space is the amount of empty storage
space provided by performing garbage collection on the
candidate segment having the lowest utilization.

37. A system as defined in claim 35, wherein the controller
performs the step of choosing by selecting a candidate
segment in accordance with a yield ranking.

38. A system as defined in claim 37, wherein the relative
yield ranking comprises a ranking of the candidate segments
according to utilization.

39. A system as defined in claim 38, wherein the relative
yield ranking comprises a plurality of utilization intervals.

40. A system as defined in claim 37, wherein the controller
further performs the step of choosing by selecting between
two candidate segments having equal yield ranking by
selecting the candidate segment having the greater age
value.

41. A system as defined in claim 37, wherein the
information segments comprise a plurality of information tracks,
and each segment is assigned an age when written from a
memory buffer into the direct access storage devices during
a destage operation.

42. A system as defined in claim 41, wherein the age of a
segment filled by garbage collection is set to the age of the
youngest segment that contributed tracks to the filled
segment.

43. A system as defined in claim 41, wherein the age of a
segment is the difference between a current destage clock
value and the destage operation destage clock value of the
segment.

*****